Model Selection for High-Dimensional Regression under the Generalized Irrepresentability Condition

نویسندگان

  • Adel Javanmard
  • Andrea Montanari
چکیده

In the high-dimensional regression model a response variable is linearly related to p covariates, but the sample size n is smaller than p. We assume that only a small subset of covariates is ‘active’ (i.e., the corresponding coefficients are non-zero), and consider the model-selection problem of identifying the active covariates. A popular approach is to estimate the regression coefficients through the Lasso (`1-regularized least squares). This is known to correctly identify the active set only if the irrelevant covariates are roughly orthogonal to the relevant ones, as quantified through the so called ‘irrepresentability’ condition. In this paper we study the ‘Gauss-Lasso’ selector, a simple two-stage method that first solves the Lasso, and then performs ordinary least squares restricted to the Lasso active set. We formulate ‘generalized irrepresentability condition’ (GIC), an assumption that is substantially weaker than irrepresentability. We prove that, under GIC, the Gauss-Lasso correctly recovers the active set.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Asymptotic distribution and sparsistency for `1 penalized parametric M-estimators, with applications to linear SVM and logistic regression

Since its early use in least squares regression problems, the `1-penalization framework for variable selection has been employed in conjunction with a wide range of loss functions encompassing regression, classification and survival analysis. While a well developed theory exists for the `1-penalized least squares estimates, few results concern the behavior of `1-penalized estimates for general ...

متن کامل

High-dimensional classification by sparse logistic regression

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. The bounds can be reduced under the additional low-noise condition. The proposed complexity penalty ...

متن کامل

The Tail Mean-Variance Model and Extended Efficient Frontier

In portfolio theory, it is well-known that the distributions of stock returns often have non-Gaussian characteristics. Therefore, we need non-symmetric distributions for modeling and accurate analysis of actuarial data. For this purpose and optimal portfolio selection, we use the Tail Mean-Variance (TMV) model, which focuses on the rare risks but high losses and usually happens in the tail of r...

متن کامل

On model selection consistency of regularized M-estimators

Penalized M-estimators are used in many areas of science and engineering to fit models with some low-dimensional structure in high-dimensional settings. In many problems arising in machine learning, signal processing, and high-dimensional statistics, the penalties are geometrically decomposable, i.e. can be expressed as a sum of support functions. We generalize the notion of irrepresentability ...

متن کامل

Asymptotic distribution and sparsistency for l1 penalized parametric M-estimators, with applications to linear SVM and logistic regression

Since its early use in least squares regression problems, the l1-penalization framework for variable selection has been employed in conjunction with a wide range of loss functions encompassing regression, classification and survival analysis. While a well developed theory exists for the l1-penalized least squares estimates, few results concern the behavior of l1-penalized estimates for general ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013